fix: raise ValueError when combat batch has fewer than 2 cells #3994
Merged
flying-sheep merged 3 commits on Mar 14, 2026
Previously, `sc.pp.combat` would silently produce NaN values when a batch contained only 1 cell, because within-batch variance cannot be estimated from a single observation. This adds input validation to raise a clear error before computation begins. Closes scverse#1175
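To make the failure mode concrete, here is a toy illustration (not scanpy's code) of why a one-cell batch yields NaN. It assumes the variance is computed with the unbiased estimator (`ddof=1`, which is the pandas default); the matrix, gene names, and cell names are made up for the example.

```python
import numpy as np
import pandas as pd

# Toy expression matrix: 3 genes (rows) x 3 cells (columns). Values are invented.
data = pd.DataFrame(
    [[1.0, 2.0, 5.0], [0.5, 1.5, 4.0], [2.0, 2.5, 3.0]],
    index=["gene1", "gene2", "gene3"],
    columns=["cell1", "cell2", "cell3"],
)

# Hypothetical batch "a" holds two cells, batch "b" only one.
var_a = data[["cell1", "cell2"]].var(axis=1)  # finite per-gene variances
var_b = data[["cell3"]].var(axis=1)           # one observation, ddof=1 -> NaN

print(var_a.tolist())
print(var_b.isna().all())

# Any later scaling by the per-batch standard deviation propagates the NaN.
scaled = data[["cell3"]].div(np.sqrt(var_b), axis=0)
print(scaled.isna().all().all())
```

With two cells the per-gene variance is finite. With one cell every entry is NaN, and anything divided by it stays NaN.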
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #3994 +/- ##
==========================================
+ Coverage 77.96% 77.97% +0.01%
==========================================
Files 118 118
Lines 12517 12523 +6
==========================================
+ Hits 9759 9765 +6
Misses 2758 2758
Contributor, Author
Hi, just checking if this PR is ready for review. CI is green and tests pass. Happy to address any feedback!
Member
OK, I simplified this a bit by reusing the groupby object. Looks great!
Contributor, Author
Thanks for the simplification. Reusing the groupby dict is much cleaner!
meeseeksmachine pushed a commit to meeseeksmachine/scanpy that referenced this pull request on Mar 14, 2026: … fewer than 2 cells
flying-sheep pushed a commit that referenced this pull request on Mar 14, 2026: … batch has fewer than 2 cells) (#4001)
Co-authored-by: LiudengZhang <99156394+LiudengZhang@users.noreply.github.com>
What
Raise a `ValueError` when `sc.pp.combat` receives a batch with fewer than 2 cells.
Why
ComBat estimates within-batch variance, which requires at least 2 observations.
When a batch contains only 1 cell, that variance cannot be estimated, and the
entire corrected matrix is silently filled with NaN. Downstream functions
like `highly_variable_genes` then fail with cryptic errors.
How
Added validation after batch grouping that checks all batch sizes ≥ 2. Raises a
clear error listing the offending batch names, so users know which batches to
filter before running combat.
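A minimal sketch of what such a check could look like, assuming batch labels live in a pandas DataFrame column and are grouped with `groupby(...).indices`. The helper name `check_batch_sizes`, its parameters, and the error wording are hypothetical and are not taken from the merged code.

```python
import pandas as pd


def check_batch_sizes(obs: pd.DataFrame, key: str, min_cells: int = 2) -> None:
    """Raise ValueError if any batch in obs[key] has fewer than min_cells rows.

    Hypothetical helper for illustration only; not scanpy's implementation.
    """
    # .indices maps each batch label to the positional row indices it contains.
    batch_indices = obs.groupby(key, observed=True).indices
    too_small = [
        str(name) for name, idx in batch_indices.items() if len(idx) < min_cells
    ]
    if too_small:
        msg = (
            f"Batches with fewer than {min_cells} cells: {', '.join(too_small)}. "
            "Filter these batches out before running batch correction."
        )
        raise ValueError(msg)


# Usage: batch "c" has a single cell, so the check reports it by name.
obs = pd.DataFrame({"batch": ["a", "a", "b", "b", "c"]})
try:
    check_batch_sizes(obs, "batch")
except ValueError as err:
    print(err)
```

Running the check right after grouping means the same mapping of batch labels to row indices can be reused for the correction itself, and the error names every offending batch at once.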
Testing
Added `test_combat_single_cell_batch`, which creates a dataset with a single-cell batch and confirms the `ValueError` is raised. All 5 existing combat tests continue to pass.
Closes #1175